Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph. However, such methods are sensitive to the structure of the input synonymy graph: sparseness of the input dictionary can substantially reduce the quality of the extracted synsets. In this paper, we propose two different approaches designed to alleviate the incompleteness of the input dictionaries. The first one performs a pre-processing of the graph by adding missing edges, while the second one performs a post-processing by merging similar synset clusters. We evaluate these approaches on two datasets for the Russian language and discuss their impact on the performance of synset induction methods. Finally, we perform an extensive error analysis of each approach and discuss prominent alternative methods for coping with the problem of the sparsity of the synonymy dictionaries.